5-6 is the average; most studies fall between 4 and 8
Levels should span the realistic range of possible values
E.g., include price levels close to min and max of market prices
Use qualitative research and pretests to decide on attributes and levels
Stages 1 & 2 example: VR Headset
Standalone:
Yes vs no
Cellular network access:
Yes vs no
Price:
$400 vs $800 vs $1,200
Battery life:
4 hours vs 8 hours vs 12 hours
How many possible profiles are there with 2 attributes at 3 levels and 2 attributes at 2 levels?
\(3 \times 3 \times 2 \times 2 = 36\)
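The count can be checked by enumerating the profiles directly. The following is a minimal sketch using the VR headset levels from the example; the object names (`vr_levels`, `vr_profiles`) are illustrative and not part of the course code.

```r
# Attribute levels from the VR headset example
vr_levels <- list(
  Standalone = c("Yes", "No"),
  Cellular   = c("Yes", "No"),
  Price      = c(400, 800, 1200),
  Battery    = c(4, 8, 12)
)

# Full factorial: one row per possible profile
vr_profiles <- expand.grid(vr_levels, stringsAsFactors = FALSE)

# Count by enumeration, and by multiplying the number of levels per attribute
n_enumerated <- nrow(vr_profiles)
n_multiplied <- prod(lengths(vr_levels))
```

Both counts should agree with the product of the level counts computed above.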
Stage 3: Create product profiles
Data collection needs simplification (why?)
Shrink total set to only include possible/realistic products
No impossible combinations
Unrealistic products can ruin our data
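One way to drop impossible combinations is to filter the full set of profiles with an explicit rule. The sketch below assumes, purely for illustration, that a non-standalone headset cannot have cellular access; that rule is hypothetical and is not stated in the example.

```r
# Full set of VR headset profiles (levels from the example)
vr_profiles <- expand.grid(
  Standalone = c("Yes", "No"),
  Cellular   = c("Yes", "No"),
  Price      = c(400, 800, 1200),
  Battery    = c(4, 8, 12),
  stringsAsFactors = FALSE
)

# HYPOTHETICAL feasibility rule, for illustration only:
# a headset that is not standalone cannot have cellular access
impossible <- vr_profiles$Standalone == "No" & vr_profiles$Cellular == "Yes"

# Keep only the profiles that pass the rule
feasible_profiles <- vr_profiles[!impossible, ]
n_removed <- sum(impossible)
n_kept    <- nrow(feasible_profiles)
```

Any real rule would come from the qualitative research and pretests mentioned earlier, not from the analyst's guess.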
Stage 4: Obtain consumer preferences for profiles
Rating
Consumers see a set of profiles, rate each
Positives?
People are used to rating things
Simple
Straightforward
Negatives?
Not a choice! (Not economically relevant necessarily)
Ratings are not necessarily comparable across products
Ranking
People see all choices and rank them
Positives?
Simple
Straightforward
Negatives?
Still not a choice
What if things aren’t evenly spaced?
Choice
People see options (usually 2+ at a time) and choose
Positives?
Real!
Potentially simple
Negatives?
Need a lot of trials
Stage 5: Data analysis
Typically conducted at the individual level
Recovers a set of utility parameter estimates for each individual
Mode of analysis dependent on response format:
Ratings data — linear regression (OLS)
Stage 5: Data analysis
Read in these responses to a conjoint survey for tablet computers
responses_DF <- read.csv("respondent_data.csv")  # survey responses ("Y" variables)
N <- nrow(responses_DF)                          # number of subjects
summary(responses_DF)
respondent_id profile_1 profile_2 profile_3 profile_4
Min. : 1.00 Min. :1 Min. :1.000 Min. :1.00 Min. :1.00
1st Qu.: 39.50 1st Qu.:3 1st Qu.:4.000 1st Qu.:2.00 1st Qu.:1.00
Median : 73.00 Median :4 Median :5.000 Median :3.00 Median :2.00
Mean : 74.17 Mean :4 Mean :4.675 Mean :3.07 Mean :2.14
3rd Qu.:113.75 3rd Qu.:5 3rd Qu.:6.000 3rd Qu.:4.00 3rd Qu.:3.00
Max. :145.00 Max. :7 Max. :7.000 Max. :7.00 Max. :7.00
profile_5 profile_6 profile_7 profile_8 profile_9
Min. :1.000 Min. :1.00 Min. :1.000 Min. :1.000 Min. :1.000
1st Qu.:3.000 1st Qu.:2.25 1st Qu.:1.000 1st Qu.:2.000 1st Qu.:1.000
Median :4.000 Median :4.00 Median :2.000 Median :3.000 Median :1.000
Mean :3.544 Mean :3.86 Mean :2.193 Mean :2.974 Mean :1.789
3rd Qu.:4.000 3rd Qu.:5.00 3rd Qu.:3.000 3rd Qu.:4.000 3rd Qu.:2.000
Max. :7.000 Max. :7.00 Max. :7.000 Max. :7.000 Max. :6.000
profile_10 profile_11 profile_12 profile_13
Min. :1.000 Min. :1.000 Min. :1.000 Min. :1.000
1st Qu.:7.000 1st Qu.:2.000 1st Qu.:4.000 1st Qu.:2.000
Median :7.000 Median :3.000 Median :5.000 Median :3.000
Mean :6.588 Mean :3.114 Mean :4.772 Mean :3.202
3rd Qu.:7.000 3rd Qu.:4.000 3rd Qu.:6.000 3rd Qu.:4.000
Max. :7.000 Max. :6.000 Max. :7.000 Max. :7.000
profile_14 profile_15 profile_16 profile_17 profile_18
Min. :1.000 Min. :1.000 Min. :1.000 Min. :1.000 Min. :1.00
1st Qu.:1.000 1st Qu.:2.000 1st Qu.:2.000 1st Qu.:2.000 1st Qu.:2.00
Median :2.000 Median :3.000 Median :4.000 Median :3.000 Median :3.00
Mean :2.351 Mean :3.158 Mean :3.693 Mean :3.386 Mean :3.14
3rd Qu.:3.000 3rd Qu.:4.000 3rd Qu.:5.000 3rd Qu.:4.000 3rd Qu.:4.00
Max. :5.000 Max. :7.000 Max. :7.000 Max. :7.000 Max. :7.00
Stage 5: Data analysis
For tablet computers, a relevant set of product attributes might be:
Screen size (inches)
Cellular network connectivity
Price
Battery life (hrs)
Operating system (OS)
Stage 5: Data analysis
For tablet computers, a relevant set of product attributes might be:
Attribute  1            2     3          4             5
Level      Screen (in)  Cell  Price ($)  Battery (hr)  OS
1          7            N     100        4             Android
2          10           Y     300        8             iOS
3                             500        12            Windows
Stage 5: Data analysis
For tablet computers, a relevant set of product combinations might be:
Screen Cell Price Battery OS
1 7 Y 300 12 Windows
2 7 Y 100 8 Windows
3 10 Y 500 12 Android
4 7 Y 300 4 Android
5 7 N 300 8 iOS
6 10 N 300 12 Windows
7 7 N 500 12 Android
8 10 N 300 8 Android
9 7 N 500 4 Windows
10 10 Y 100 12 iOS
11 10 Y 300 4 iOS
12 7 N 100 12 iOS
13 10 Y 500 8 Windows
14 10 N 500 4 iOS
15 10 N 100 4 Windows
16 10 N 100 8 Android
17 7 Y 500 8 iOS
18 7 Y 100 4 Android
Stage 5: Data analysis
Let’s say we have a customer who responds to those profiles like this:
Baseline profile corresponds to all omitted attribute levels
Part-worths measure incremental utility relative to the baseline
Part-worths of included attribute levels = regression coefficients
Part-worths of omitted attribute levels = 0 by definition
Stage 5: Data analysis
Ratings Data
What is the baseline profile?
The baseline profile is the profile associated with the omitted factor levels from the dummy variable coding. In this case, the baseline profile is a 7” screen, no cellular connectivity, a $100 price, a 4-hour battery life, and Android OS.
What is the expected utility (rating) for the baseline profile?
The utility/rating for this profile is given by the intercept estimate.
4.3472
Interpret the coefficients from the regression
Dummy variable coefficients represent the incremental utility (rating) relative to the baseline profile
e.g., the coefficient on factor(Screen)10 represents the incremental utility of changing from a 7” screen to a 10” screen
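The link between dummy coding, the intercept, and the part-worths can be illustrated with a small sketch. It rebuilds the 18-profile design from the table above and generates ratings from MADE-UP part-worths for one imaginary respondent (these are not the survey estimates, and the intercept of 4 is not the 4.3472 reported above), then checks that the regression recovers them.

```r
# Design matrix: the 18 tablet profiles listed above
design_DF <- data.frame(
  Screen  = c(7, 7, 10, 7, 7, 10, 7, 10, 7, 10, 10, 7, 10, 10, 10, 10, 7, 7),
  Cell    = c("Y", "Y", "Y", "Y", "N", "N", "N", "N", "N",
              "Y", "Y", "N", "Y", "N", "N", "N", "Y", "Y"),
  Price   = c(300, 100, 500, 300, 300, 300, 500, 300, 500,
              100, 300, 100, 500, 500, 100, 100, 500, 100),
  Battery = c(12, 8, 12, 4, 8, 12, 12, 8, 4, 12, 4, 12, 8, 4, 4, 8, 8, 4),
  OS      = c("Windows", "Windows", "Android", "Android", "iOS", "Windows",
              "Android", "Android", "Windows", "iOS", "iOS", "iOS",
              "Windows", "iOS", "Windows", "Android", "iOS", "Android")
)

# MADE-UP part-worths (illustration only): baseline rating 4, no noise
rating <- 4 +
  0.5 * (design_DF$Screen == 10) +
  0.3 * (design_DF$Cell == "Y") -
  0.8 * (design_DF$Price == 300) -
  1.6 * (design_DF$Price == 500) +
  0.4 * (design_DF$Battery == 8) +
  0.9 * (design_DF$Battery == 12) +
  0.6 * (design_DF$OS == "iOS") -
  0.2 * (design_DF$OS == "Windows")

est_DF <- cbind(design_DF, rating = rating)
fit <- lm(rating ~ factor(Screen) + factor(Cell) + factor(Price) +
            factor(Battery) + factor(OS), data = est_DF)

# The intercept is the predicted rating of the baseline profile
baseline <- data.frame(Screen = 7, Cell = "N", Price = 100,
                       Battery = 4, OS = "Android")
baseline_rating <- unname(predict(fit, newdata = baseline))
intercept       <- unname(coef(fit)["(Intercept)"])

# Part-worth of moving from a 7-inch to a 10-inch screen
screen10 <- unname(coef(fit)["factor(Screen)10"])
```

Because the ratings here contain no noise, the fit is exact; with real survey responses the coefficients are estimates and the same interpretation applies.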
Stage 5: Data analysis
Generate similar estimates for each individual represented in our supplemental data, responses_DF
How?
Create an empty list-of-lists to hold the regression results for each individual
Create a dataframe that combines design_DF with the responses for individual i.
Estimate a linear model with individual i’s responses as the dependent variable, and the columns of design_DF as (factor) regressors
Store the linear model results to the i’th element of lm_res.
To access or assign (top level) lists in a list-of-lists, we must use double bracket indexing, as in: lm_res[[i]]
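The difference between single and double bracket indexing can be seen in a short sketch. It uses the built-in `cars` dataset only as a stand-in for a respondent's data; the list name `res` is illustrative.

```r
# A list with one slot per respondent (3 here, purely for illustration)
res <- vector(mode = "list", length = 3)

# Double brackets assign to, and return, the element itself
res[[2]] <- lm(dist ~ speed, data = cars)  # 'cars' ships with base R
class(res[[2]])  # "lm"

# Single brackets return a sub-list of length 1, not the model object
class(res[2])    # "list"
```

Functions such as `coef()` or `predict()` therefore need `res[[2]]`, not `res[2]`.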
Stage 5: Data analysis
## estimate regression models, 1 per individual
# initialize an empty "list of lists" to hold lm() regression results
lm_res <- vector(mode = "list", length = nrow(responses_DF))
# loop over subjects
for (i in 1:nrow(responses_DF)) {
  # get survey responses for subject i (dropping respondent_id)
  response <- as.numeric(responses_DF[i, 2:ncol(responses_DF)])
  # create "estimation" dataframe, including i's responses (Y) and the design variables (X)
  est_DF <- cbind(design_DF, response = response)
  # run regression for subject i, store as lm_res[[i]]
  # note use of double bracket indexing [[ ]] syntax to access top-level lists
  lm_res[[i]] <- lm(response ~ factor(Screen) + factor(Cell) + factor(Price) +
                      factor(Battery) + factor(OS),
                    data = est_DF)
}
Stage 6: Simulate market outcomes
Consider a hypothetical market
Start with closest approximation to current competitive landscape
Add 1st new product under consideration for introduction
Compute expected utility for each of the products, for each subject in the sample
Predict market shares (or units sold) for all products:
Assume subjects choose product with the highest utility
Product share = fraction of subjects who choose that product
Repeat steps above for remaining new product designs
Introduce product concept with highest expected profit/market share
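The share calculation in the steps above can be sketched as follows. Everything here is an assumption made for illustration: the respondents are simulated with random ratings, the design is a full factorial rather than the 18-profile design, the candidate product is arbitrary, and `first_choice_shares()` is a hypothetical helper, not the course's own function. Profit is left out because it depends on margins and market size not given here.

```r
set.seed(1)

# ASSUMPTION: full-factorial design, used only to keep the sketch short
design_DF <- expand.grid(Screen = c(7, 10), Cell = c("N", "Y"),
                         Price = c(100, 300, 500), Battery = c(4, 8, 12),
                         OS = c("Android", "iOS", "Windows"),
                         stringsAsFactors = FALSE)

# SIMULATED respondents: random 1-7 ratings stand in for real survey data
n_subj <- 20
lm_res <- vector(mode = "list", length = n_subj)
for (i in seq_len(n_subj)) {
  est_DF <- cbind(design_DF,
                  response = sample(1:7, nrow(design_DF), replace = TRUE))
  lm_res[[i]] <- lm(response ~ factor(Screen) + factor(Cell) + factor(Price) +
                      factor(Battery) + factor(OS), data = est_DF)
}

# Hypothetical helper: first-choice shares for the products in 'prods'
# (expects at least two products, one per row)
first_choice_shares <- function(lm_res, prods) {
  # predicted utilities: one row per subject, one column per product
  util <- t(sapply(lm_res, function(fit) predict(fit, newdata = prods)))
  # each subject chooses the product with the highest predicted utility
  # (ties go to the first product)
  choice <- apply(util, 1, which.max)
  # share = fraction of subjects choosing each product
  tabulate(choice, nbins = nrow(prods)) / length(lm_res)
}

# Market: an arbitrary Android candidate vs. the iPad profile used in this example
prods <- data.frame(Screen = c(7, 10), Cell = c("N", "Y"),
                    Price = c(300, 500), Battery = c(12, 8),
                    OS = c("Android", "iOS"))
shares <- first_choice_shares(lm_res, prods)
names(shares) <- c("candidate", "iPad")
```

Repeating the last three lines with a different first row in `prods` corresponds to the "repeat for remaining new product designs" step.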
Stage 6
Conjoint allows us to evaluate “what if” scenarios
How would a hypothetical “new” product fare in competition with existing products?
In the context of our example
How a new product by Toshiba will compete against the existing iPad
Assume the existing “iPad” product corresponds to:
Screen = 10 (inches)
Cell = “Y” (has cell connectivity)
Price = 500 ($)
Battery = 8 (hrs)
OS = “iOS”
Stage 6
Assume (initially) that Toshiba is considering one potential product design, Toshiba_A
The profit from Toshiba_B is higher ($9635 vs $7350), so Toshiba should go with product B
Stage 6
Search over all Android OS alternative products
The trick here is first to enumerate all possible Android-based tablet designs. The expand.grid() function is useful for this purpose
After enumerating all possible Android-based tablet designs, loop over the designs and compute profits as before. Store the profit values in a list.
Find the element of the profit list with the highest profit. Use the index value for this profile to print the associated product attribute levels.
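The last step, locating the most profitable design and printing its attributes, might look like the sketch below. The profit values are random PLACEHOLDERS, not results from the tablet data; only the indexing pattern is the point.

```r
set.seed(42)

# Candidate Android designs, enumerated with expand.grid()
# (attribute levels taken from the tablet example)
allprods_DF <- expand.grid(Screen = c(7, 10), Cell = c("N", "Y"),
                           Price = c(100, 300, 500), Battery = c(4, 8, 12),
                           stringsAsFactors = FALSE)

# PLACEHOLDER profits: one made-up number per design
pft <- runif(nrow(allprods_DF), min = 0, max = 10000)

# Index of the design with the highest profit (first one if there are ties)
best <- which.max(pft)

# Attribute levels of that design
best_design <- allprods_DF[best, ]
print(best_design)
```

Since `pft` and `allprods_DF` share the same row order, the index returned by `which.max()` can be used on both.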
Stage 6
# create dataframe with all combinations of screen/cell/price/battery
allprods_DF <- expand.grid(unique(design_DF$Screen),
                           unique(design_DF$Cell),
                           unique(design_DF$Price),
                           unique(design_DF$Battery))
colnames(allprods_DF) <- c("Screen", "Cell", "Price", "Battery")
Stage 6
Np <- nrow(allprods_DF)  # number of products to test
# calculate profits for each candidate product, assuming iPad competition
pft <- rep(0, Np)
for (i in 1:Np) {
  prods <- data.frame(Screen  = c(allprods_DF[i, "Screen"], 10),
                      Cell    = c(as.character(allprods_DF[i, "Cell"]), "Y"),
                      Price   = c(allprods_DF[i, "Price"], 500),
                      Battery = c(allprods_DF[i, "Battery"], 8),
                      OS      = c("Android", "iOS"))
  prods.results <- tablet_pft1(lm_res, prods)
  pft[i] <- prods.results$profit
}
Stage 6
# profit
max(pft)
[1] 15015
# change from Toshiba_B
max(pft) - prods2.results$profit